parallel ST_Union #698
Gotcha. That makes sense, I guess. The same behaviour would be available by going ST_Union(ST_Collect(ST_Buffer(...))), right? But this allowed the aggregate to "parallelize" and thus pick up efficiencies inside by default. +1
@pramsey I get an error when trying this query. Am I missing something?
Oh interesting, I guess we don't have a unary version of ST_Union (other than ST_UnaryUnion).
Maybe the same thing would work like this: ST_Union(array_agg(ST_Buffer())) ... basically, there's a version of union that takes an array.
It looks like array_agg cannot be executed in parallel:
Grepping Postgres source, there are only |
That's very interesting. A little low-hanging core PgSQL performance fruit. PgSQL core devs don't believe that individual functions can ever have much cost compared to I/O, so they leave this stuff lying around.
There is a patch that implements parallel support for array_agg, written back in 2017: https://www.postgresql.org/message-id/flat/CAKJS1f9sx_6GTcvd6TMuZnNtCh0VhBzhX6FZqw17TgVFH-ga_A%40mail.gmail.com
Very subtle.
@pramsey I am wondering whether the possibility of specifying a custom allocator in GEOS has already been discussed somewhere. Making GEOS use palloc would unlock further improvements in parallel ST_Union.
Now that |
Anything that was ever exposed via the SQL API and needs to be removed goes to https://github.com/postgis/postgis/blob/master/postgis/postgis_legacy.c#L25 - you have to keep the signature for flawless binary-only system upgrades.
deserialfn must be STRICT for Postgres=13.1 compatibility
The GitHub Actions failure can be ignored. I think it has something to do with a GDAL change. The Drone failure looks like a real issue caused by a missing drop of a long-deprecated function.
The GDAL issue is already fixed: OSGeo/gdal#5946. We just need to rebuild the test environment.
On Tue, Jun 21, 2022 at 01:40:13AM -0700, Sergei wrote:
> Now that `pgis_geometry_union_finalfn()` function is not used, should I mark it somehow or just remove it?
> If it can be removed, I can drop "_parallel_" from my function names.
You can drop it in the postgis/postgis_after_upgrade.sql file
Make sure to mark the AGGREGATE definition as changed, in some way,
so that create_upgrade.pl can deal with it.
We need all upgrade kinds to work smoothly, but not all upgrade paths
are tested by our bots.
See https://git.osgeo.org/gitea/postgis/postgis/pulls/30 as an early
way to test upgrades in presence of views using aggregates.
The new way is creating those views in
regress/hooks/hook-before-upgrade.sql
https://git.osgeo.org/gitea/postgis/postgis/src/branch/master/regress/hooks/hook-before-upgrade.sql
And drop them in
regress/hooks/hook-after-upgrade.sql
https://git.osgeo.org/gitea/postgis/postgis/src/branch/master/regress/hooks/hook-after-upgrade.sql
Happy testing!
Simple parallel ST_Union implementation: `transfn` just adds the values to a list, `combinefn` concatenates the lists from different workers, and the cascaded union is still fully calculated in `finalfn`. This doesn't affect the execution time of ST_Union itself, but allows parallelizing queries like `SELECT ST_Union(ST_Buffer(...)) ...`.
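The split described above can be illustrated with a toy model in Python. This is only a sketch, not the PostGIS code: geometries are replaced by 1-D intervals, `buffer` and `parallel_union` are hypothetical helpers invented for the illustration, and the "workers" are simulated by slicing the input. What it shows is that the expensive per-row call (the buffer) runs inside each worker, while the union itself still happens once, in the final step.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # toy stand-in for a geometry (assumption)


def buffer(iv: Interval, d: float) -> Interval:
    """Toy stand-in for ST_Buffer: grow the interval by d on both sides."""
    return (iv[0] - d, iv[1] + d)


def transfn(state: List[Interval], value: Interval) -> List[Interval]:
    """Per-row step: only append the value, no union work yet."""
    state.append(value)
    return state


def combinefn(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Merge two partial states by concatenating their lists."""
    return a + b


def finalfn(state: List[Interval]) -> List[Interval]:
    """Compute the whole union once, over everything that was collected."""
    merged: List[Interval] = []
    for lo, hi in sorted(state):
        if merged and lo <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged


def parallel_union(rows: List[Interval], workers: int, d: float) -> List[Interval]:
    """Hypothetical driver: simulate workers by slicing the rows."""
    partials = []
    for w in range(workers):
        state: List[Interval] = []
        for row in rows[w::workers]:
            # The buffer call is the part that benefits from parallelism.
            state = transfn(state, buffer(row, d))
        partials.append(state)
    combined: List[Interval] = []
    for partial in partials:
        combined = combinefn(combined, partial)
    return finalfn(combined)


rows = [(0.0, 1.0), (2.0, 3.0), (10.0, 11.0), (3.5, 4.0)]
serial = parallel_union(rows, workers=1, d=0.5)
parallel = parallel_union(rows, workers=3, d=0.5)
print(serial == parallel)
```

Because `finalfn` sorts and merges the full list, the result does not depend on how the rows were divided among workers, which is the property that makes the aggregate safe to run in parallel.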